3 Sinusoidal Model

1 Sinusoidal Model

Goal: to fit a simple sinusoidal[1] model to the sunspots data.

Let's consider the expression of sinusoidal model: s(t)=Acos(2πft)+Bsin(2πft),A=Rcosϕ,B=Rsinϕ.
For us, usually t=1,2,,n. In this case, we can restrict the frequency f[0,12].[2]

Based on the remark,

Fact

For every f,ϕ, there exists f0[0,12],ϕ0, s.t. s(t;f,ϕ)=s(t,f0,ϕ0), tN.

When f=0, s(t) is constant; when f=12, s(t)=Rcos(πt+ϕ)=(1)tRcos(ϕ).

Sinusoidal Model

yt=β0+β1cos2πft+β2sin2πft+εt,εti.i.dN(0,σ2).
The parameters here are β0,β1,β2,σ,f.

If f is known, it's a linear model. Otherwise it's a nonlinear regression model.

2 Frequentist Inference

Like in linear models, frequentist calculate the MLE: i=1n12πσexp[(ytβ0β1cos2πftβ2sin2πft)22σ2]σnexp[12σ2t=1n(ytβ0β1cos2πftβ2sin2πft)2]=σnexp[12σ2S(β0,β1,β2,f)].
So the problem is to find (β^0,β^1,β^2,f^) that minimizes S(β0,β1,β2,f)=t=1n(ytβ0β1cos2πftβ2sin2πft)2(2.1)=||yXfβ||2. We first fix f, so this is just a linear regression. We have β^(f)=(XfTXf)1XfTy,Xf=[1cos2πf(1)sin2πf(1)1cos2πf(n)sin2πf(n)]. Then f^=argminfS(β^(f),f),β=(β0,β1,β2)T.

3 Bayesian Inference

The posterior is σnexp[S(β,f)2σ2]1{0f12}, and assume prior β0,β1,β2,logσi.i.dUniform[C,C], fUniform[0,12].
Then[3] fβ,σ,f|data=σn1exp[S(β,f)2σ2]1{0f12}1{C<β0,β1,β2,logσ<C}, then the posterior density of f σn1exp[S(β,f)2σ2]dβdσ.

By Pythagorean identity, S(β,f)=||yXfβ||2=S(β^f,f)+(ββ^f)TXfTXf(ββ^f).
So further on σn1exp(S(β^f,f)2σ2)[(ββ^f)TXfTXf(ββ^f)2σ2]dβdσ=σn1exp(S(β^f,f)2σ2)(2π)p2det(σ2(XfTXf)1)σn1exp(S(β^f,f)2σ2)σp|XfTXf|12dσ=|XfTXf|12σn+p1exp(S(β^f,f)2σ2)dσ=0tn+p1(S(β^f,f))np2exp(12t2)dt(S(β^f,f))np2|XfTXf|12.
Compare with linear regression: Posterior(S(β))n2.

4 Fourier Frequency

RSS(f) (S above measures how well the sinusoid at frequency f fits the data). So we should take a grid of values for f, and then compute RSS(f) for each grid point.

Common Choice of Grid for f: (notate as G)

Note that n is the size of data.

Fourier Frequency

Fourier frequency is frequency f s.t. nf is an integer.

So our common choice inside G are all Fourier frequencies lying in [0,12].

When fG, plug in β^f=(XfTXf)1XtTy, RSS(f)=||yXfβ^f||2=(yXfβ^f)T(yXfβ^f)=yTyβ^fTXfTyyTXfβ^f+β^fTXfTXfβ^f=yTyyT(XfTXf)1XfTyyTXf(XfTXf)1XfTy+yTXf(XfTXf)1XfTXf(XfTXf)1XfTy=yTyyTXf(XfTXf)1XfTy.
And XfTXf=[11cos2πf1cos2πfnsin2πf1sin2πfn][1cos2πf1sin2πf11cos2πfnsin2πfn]=[nt=1ncos(2πft)t=1nsin(2πft)t=1ncos2(2πft)t=1ncos(2πft)sin(2πft)t=1nsin22πft]. Next let r=e2πift, [4]t=1ncos(2πft)=t=1ne2πift+e2πift2=r=e2πif12t=1nrt+12t=1nrt=12e2πife2πif1(e2πif1)+12e2πife2πif1(e2πif1)=0.

Next similarly t=1ncos2(2πft)=t=1n1+cos(4πft)2=n2,t=1ncos(2πft)sin(2πft)=12t=1nsin(4πft)=0.

One more fact

f1,f2 are two distinct Fourier frequencies. Then t=0n1cos(2πf1t)sin(2πf2t)=0,t=0n1sin(2πf1t)sin(2πf2t)=0,t=0n1sin(2πf1t)cos(2πf2t)=0.

So XfTXf=[n000n2000n2],(XfTXf)1=[1n0002n0002n]. Then RSS(f)=yTy(t=1nytt=1nytcos2πftt=1nytsin2πft)[1n0002n0002n](t=1nytt=1nytcos2πftt=1nytsin2πft)=yTy1n(t=1nyt)22n(t=1nytcos2πft)22n(t=1nytsin2πft)2=t=1n(yty)22n(t=1nytcos2πft)22n(t=1nytsin2πft)2. Note again that this is only true when f[0,12] and f is Fourier frequency.

Periodogram

Define periodogram as I(f)=1n[(t=1nytcos2πft)2+(t=1nytsin2πft)2].

So RSS(f)=t=1n(yty)22I(f).


We can also rewrite I(f)=1n|t=1nyt(cos2πft+isin2πft)|2=1n|t=1nyte2πift|2. This is exactly a Fourier transformation.

5 Some Other Nonlinear Regression Models

  1. yt=β0+β1t+β2cos(2πft)+β3sin(2πft)+εt. RSS(f)=t=1n(ytβ0β1tβ2cos(2πft)β3sin(2πft))2.
  2. (Broken stick / change of slope) yt=β0+β1t+β2(ts)++εt, here (ts)+=max{ts,0}.

  1. (正弦曲线的) s(t)=Rcos(2πft+ϕ). Here R is called the amplitude (振幅), ϕ is called phase (相位), f is the frequency (频率). The period here is 1f, and 2πf is angular frequencing ↩︎

  2. Because t is an integer, and we are looking at s(t)=Rcos(2πft+ϕ). If say f=3.5, then s(t)=Rcos(2π(3.5)tϕ)=Rcos(6πt+2π(0.5)tϕ)=Rcos(2π(0.5)tϕ). ↩︎

  3. Here the first f means the likelihood function, and the subscripted f means frequency ↩︎

  4. Last equation 0 is because f is a Fourier frequency, then nfZ. ↩︎